We study the fundamental limits on learning latent community structure in dynamic networks. 
Specifically, we study dynamic stochastic block models where nodes change their community membership 
over time, but where edges are generated independently at each time step. In this setting 
(which is a special case of several existing models), we are able to derive the detectability threshold 
exactly, as a function of the rate of change and the strength of the communities. Below this threshold, 
we claim that no algorithm can identify the communities better than chance. We then give 
two algorithms that are optimal in the sense that they succeed all the way down to this limit. The 
first uses belief propagation (BP), which gives asymptotically optimal accuracy, and the second is 
a fast spectral clustering algorithm, based on linearizing the BP equations. We verify our analytic 
and algorithmic results via numerical simulation, and close with a brief discussion of extensions and 
open questions. 


Relational or interaction variables are a common feature of modern data sets, and these are often represented as 
a network. Examples include friendships or communication within a social network, regulatory interactions among 
genes, transportation between cities, and relations or hyperlinks in information systems. Many, perhaps most, of these 
systems are also dynamic in nature, and their evolving structure is commonly represented as a sequence of graphs 
[1-5]. Recently, a variety of techniques have been developed for automatically detecting communities (a task that is 
similar to traditional clustering [9], but on graphs) in these dynamic networks. These techniques include variants of 
multilayer or temporal modularity optimization [iiiiniiii], non-negative matrix or tensor factorization niaiaiiiis], 
minimum description length [Til ITS] . and probabilistic models HlZllIlin]- See Refs, [mill] for reviews. Despite 
these advances, relatively little is known about their optimality or the fundamental difficulty of detecting community 
structure in dynamic networks. In this paper, we derive a mathematically precise threshold on the detectability of 
communities in dynamic networks and give two algorithms that are optimal in the sense that they succeed all the way 
down to this threshold. 

Community detection in dynamic networks inherits many of the challenges of community detection in static networks, 
including learning the number of communities, their sizes and node membership, and the pattern of connections 
among communities, e.g., assortative, disassortative, core-periphery, etc. It also poses new challenges, because both 
the network edges and the community memberships may evolve over time. A common approach is to simply take the 
union of dynamic graphs over a certain time window, and treat the resulting graph with techniques from static network 
analysis [1], thereby ignoring the dynamics within the window. Here, we explicitly model the dynamic nature of these 
networks and the way community memberships change over time, integrating information about the communities in 
an optimal way. 

Our approach relies on probabilistic generative models, which can be used to learn latent community structure in 
real networks via Bayesian inference and to generate synthetic networks with known structure that can be used as 
benchmarks. A number of such models have recently been proposed for detecting communities in dynamic networks |U 
ITT] , including those based on the stochastic block model (SBM) (THl [H] and its mixed membership counterpart [7]. 
Indeed, the variant of the stochastic block model [23, 24] we analyze here is a special case of some of these models: 
namely, where nodes change their community membership over time, but where edges are generated independently 
at each time step. As a result, the network of connections between nodes at different times is locally treelike, which 
makes a belief-propagation approach asymptotically optimal and allows us to compute the detectability threshold 
exactly. 

In static networks, it has recently been shown that there exists a phase transition in the detectability of communities 
[25-28] such that below the transition no algorithm can recover the true communities better than chance (for two 
groups of equal size) but that efficient algorithms exist above it. Here, we generalize this result to dynamic networks, 
deriving a mathematically precise expression that describes where the detectability transition occurs as a function of 
both the strength of the communities and how quickly their membership is changing. When temporal correlations in 
community membership are present, we show that community detection in dynamic networks improves substantially 
over detection in static networks (or in a dynamic network where we cluster each graph independently). 

Finally, we give two principled and efficient algorithms for community detection in dynamic networks. Specifically, 
we use belief propagation (BP) to pass messages between neighbors both within a given graph and between time-adjacent 
graphs to integrate information over the network’s history in an optimal way. We then linearize BP to obtain 
a spectral algorithm, based on a dynamic version of the non-backtracking matrix [29, 30]. We show experimentally 
that these algorithms can accurately recover the true community structure in dynamic networks all the way down to 
the threshold. 


I. A DYNAMIC STOCHASTIC BLOCK MODEL 

The stochastic block model (SBM) is a classic model of community structure in static networks. Here, we use a 
variant of the SBM in which the community labels of nodes change over time, but where edges are independent, which 
is a special case of several models previously introduced for community detection in dynamic networks [HITIIIMIH]. 
Crucially, our variant captures the important behavior of changing community labels and is analytically tractable. 

Under the SBM a graph $G = (V, E)$ is generated as follows. Using a prior distribution over $k$ group or community 
labels, we assign each of the $n$ nodes $i \in V$ to a group $g_i$. We then generate the edges $E$ according to the probability 
specified by a $k \times k$ community interaction matrix $p$ and the group assignments $g$. In the sparse case, where $|E| = O(n)$, 
the resulting network is locally treelike and the number of edges between groups is Poisson distributed with parameter 

$$c_{rs} = n p_{rs} .$$

In a dynamic network, we have a sequence of graphs $G(t) = (V, E(t))$ with $0 \le t \le T$, where each graph has its 
own group assignment vector $\{g_i(t) \mid i \in V, t \in \{1, \dots, T\}\}$. To generate each such assignment, we draw $g_i(0)$ from the 
prior, where each node has probability $q_r$ of being in community $1 \le r \le k$. With probability $\eta$, each node keeps its 
label from one time step to the next, $g_i(t) = g_i(t-1)$, and otherwise it chooses a new label $g_i(t)$ from the prior $q_r$. 
Formally, the transition probability for community memberships is 

$$P(g(t) \mid g(t-1)) = \prod_i \left[ \eta\, \delta_{g_i(t), g_i(t-1)} + (1 - \eta)\, q_{g_i(t)} \right] , \qquad (1)$$

where $\delta_{a,b} = 1$ if $a = b$ and 0 otherwise. The edges $E(t)$ are then generated independently for each $t$ according to 
the community interaction matrix $p$, by connecting each pair of nodes at the same time $i(t)$ and $j(t)$ with probability 
$p_{g_i(t), g_j(t)}$. Note that while the group assignments may change over time, the matrix $p$ remains constant. Subsequently, 
we use $A^{(t)}$ to denote the adjacency matrix for the graph $(V, E(t))$ at time $t$, and $D^{(t)}$ to denote the diagonal matrix 
of node degrees at time $t$, i.e., $D^{(t)}_{uv} = \delta_{uv} \sum_w A^{(t)}_{uw}$. 

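At successive times in this model, edges are correlated only through the group assignments $\{g(t)\}$. Given these, 
the full likelihood of a graph sequence under this dynamic SBM is 

$$P(\{E(t)\}, \{g(t)\} \mid p, \eta) = P(\{g(t)\}) \prod_{t=0}^{T} \left[ \prod_{(i,j) \in E(t)} p_{g_i(t), g_j(t)} \prod_{(i,j) \notin E(t)} \left( 1 - p_{g_i(t), g_j(t)} \right) \right] , \qquad (2)$$

where $P(\{g(t)\}) = P(g(0)) \prod_{t=1}^{T} P(g(t) \mid g(t-1))$. 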

For our subsequent analysis, we focus on the common choices of a uniform prior $q_r = 1/k$, and where $c_{rs} = n p_{rs}$ has 
two distinct entries: $c_{rs} = c_{\mathrm{in}}$ if $r = s$ and $c_{rs} = c_{\mathrm{out}}$ if $r \ne s$. In this setting the average degree of each graph is then 
$c = \frac{1}{k} [c_{\mathrm{in}} + (k - 1) c_{\mathrm{out}}]$. We are interested in the sparse regime where $c = O(1)$, because most real-world networks 
of interest are sparse (e.g., the Facebook social network), and sparsity allows us to carry out asymptotically optimal 
inference. Note that the case where every group has distinct average degrees is easier than the equal-average-degree 
case that we consider, because distinct average degrees give prior information about group memberships. 
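
As a rough illustration of the generative process described in this section, the following Python sketch draws one graph sequence under the simplifying choices above (uniform prior, two-valued $c_{rs}$). The function name `sample_dynamic_sbm`, its argument names, and the use of dense NumPy arrays are assumptions made for this example and are not part of the model's definition; dense arrays are only practical for small $n$.

```python
import numpy as np

def sample_dynamic_sbm(n, k, T, c_in, c_out, eta, rng):
    """Sample labels g[t, i] and adjacency matrices A[t] for t = 0..T.

    Hypothetical helper: uniform prior q_r = 1/k, p_rs = c_rs / n.
    """
    p_in, p_out = c_in / n, c_out / n
    g = np.empty((T + 1, n), dtype=int)
    g[0] = rng.integers(k, size=n)            # initial labels from the prior
    for t in range(1, T + 1):
        keep = rng.random(n) < eta            # keep the old label w.p. eta
        fresh = rng.integers(k, size=n)       # otherwise redraw from the prior
        g[t] = np.where(keep, g[t - 1], fresh)
    A = np.zeros((T + 1, n, n), dtype=int)
    for t in range(T + 1):
        same = g[t][:, None] == g[t][None, :]
        prob = np.where(same, p_in, p_out)
        # Draw each pair i < j independently, then symmetrize.
        upper = np.triu(rng.random((n, n)) < prob, k=1)
        A[t] = upper + upper.T
    return g, A
```

Setting `eta=1` freezes the labels for the whole sequence, while `eta=0` redraws them independently at every step, which corresponds to the two extremes discussed in the next section.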


II. THE DETECTABILITY THRESHOLD IN DYNAMIC NETWORKS 

The fundamental question we now consider is, under what conditions can we detect, better than chance, the correct 
time-evolving labeling of the latent communities in this model? 

Previous work on community detection in static networks has shown that there exists a sharp threshold below which 
no algorithm can perform better than chance in recovering the latent community structure [25, 26], at least in the 
case $k = 2$. This threshold occurs at positive values of the difference in the internal and external group connection 
probabilities, meaning that the community structure may still exist, but is undetectable. In terms of the SBM’s 
parameters, this phase transition occurs at 

$$|c_{\mathrm{in}} - c_{\mathrm{out}}| = k \sqrt{c} . \qquad (3)$$


In a dynamic network where community memberships correlate across time, we will exploit these correlations to 
improve upon the static detectability threshold. In the worst case, when these temporal correlations are absent, 
i.e., $\eta = 0$, we should do no worse than the static threshold. To facilitate our analysis, we define an extended 
graph structure, called a spatiotemporal graph, in which we take $G(t)$ and add special “temporal” edges that connect 
each node $i(t)$ with its time-adjacent versions $i(t-1)$ and $i(t+1)$. Under our model, the “spatial” edges $E(t)$ are 
independent and sparse, implying that this spatiotemporal graph is locally treelike. 

Consider a particular node $i(t)$ as $n \to \infty$ and $T \to \infty$. Moving outward in space and time, inference becomes a 
tree reconstruction problem, with stochastic transition matrix $\sigma$ along each spatial edge, 

$$\sigma = \lambda I + (1 - \lambda) \frac{J}{k} , \qquad (4)$$

where $I$ is the identity matrix, $J$ is the matrix of all 1s, and 

$$\lambda = \frac{c_{\mathrm{in}} - c_{\mathrm{out}}}{k c} . \qquad (5)$$

Similarly, along each temporal edge we have a stochastic matrix 

$$\tau = \eta I + (1 - \eta) \frac{J}{k} . \qquad (6)$$

Thus, moving along a spatial or temporal edge copies a community label with probability $\lambda$ or $\eta$ respectively, and 
otherwise randomizes it according to the prior. That is, these edges multiply the distribution of labels by the stochastic 
matrices $\sigma$ and $\tau$, whose eigenvalues are $\lambda$ and $\eta$, other than the trivial eigenvalue 1 corresponding to the uniform 
distribution. 
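
The eigenvalue statement above can be checked numerically. The sketch below builds a matrix of the shared form "copy with some probability, otherwise randomize uniformly" and computes its spectrum; the helper name `copy_or_randomize` and the values of $k$, $\lambda$, and $\eta$ are illustrative assumptions, not values from the paper.

```python
import numpy as np

def copy_or_randomize(prob, k):
    """Return prob * I + (1 - prob) * J / k, the form shared by sigma and tau."""
    return prob * np.eye(k) + (1.0 - prob) * np.ones((k, k)) / k

k, lam, eta = 3, 0.4, 0.7                     # illustrative values only
sigma = copy_or_randomize(lam, k)
tau = copy_or_randomize(eta, k)

# Both matrices are symmetric, so eigvalsh applies and returns ascending order.
sigma_eigs = np.linalg.eigvalsh(sigma)        # expect lam (k - 1 times) and 1
tau_eigs = np.linalg.eigvalsh(tau)            # expect eta (k - 1 times) and 1
```

The eigenvalue 1 belongs to the uniform vector, and every vector whose entries sum to zero is scaled by $\lambda$ (respectively $\eta$), which is why only these two numbers enter the threshold calculation that follows.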

Since each node in the spatiotemporal graph has Poi($c$) (a Poisson-distributed random variable with mean $c$) spatial 
edges but exactly two temporal edges, the tree is generated by a two-type branching process. Each spatial edge gives 
rise to two temporal edges (to each of the time-adjacent versions of its end point), and each temporal edge gives rise 
to one temporal edge (continuing in the same direction in time), and both give rise to Poi($c$) spatial edges. Thus the 
matrix describing the expected number of children (where we multiply a column vector of populations on the left) 
is $\begin{pmatrix} c & c \\ 2 & 1 \end{pmatrix}$. Using the results of Ref. [31], the detectability threshold occurs when the largest eigenvalue of the matrix 
$\begin{pmatrix} c\lambda^2 & c\lambda^2 \\ 2\eta^2 & \eta^2 \end{pmatrix}$ exceeds unity, which yields 

$$c \lambda^2 > \frac{1 - \eta^2}{1 + \eta^2} . \qquad (7)$$


When $\eta = 0$, i.e., when there is no temporal correlation in community assignments over time, Eq. (7) recovers the 
static detectability threshold $c\lambda^2 > 1$, which is equivalent to Eq. (3). On the other hand, when $\eta = 1$, i.e., when the 
community assignments are fixed across time, we may simply integrate the graph over $T$, making it arbitrarily dense. 
We then have detectability for any $\lambda > 0$, implying that any amount of community structure can be detected. At 
intermediate values of $\eta$, the detectability threshold falls between these two extremes. 
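
The threshold condition can be explored numerically with a short sketch. The function names below (`community_strength`, `branching_eigenvalue`, `is_detectable`) are hypothetical helpers introduced for this example; they evaluate $\lambda$ from Eq. (5), the leading eigenvalue of the weighted branching matrix, and the inequality in Eq. (7).

```python
import numpy as np

def community_strength(c_in, c_out, k):
    """lambda = (c_in - c_out) / (k c), with c the average degree."""
    c = (c_in + (k - 1) * c_out) / k
    return (c_in - c_out) / (k * c)

def branching_eigenvalue(c, lam, eta):
    """Largest eigenvalue of [[c lam^2, c lam^2], [2 eta^2, eta^2]]."""
    m = np.array([[c * lam**2, c * lam**2],
                  [2 * eta**2, eta**2]])
    return float(np.max(np.linalg.eigvals(m).real))

def is_detectable(c, lam, eta):
    """Closed-form version of the same condition, c lam^2 > (1 - eta^2)/(1 + eta^2)."""
    return c * lam**2 > (1 - eta**2) / (1 + eta**2)
```

For example, with $c = 16$ and $\lambda = 0.2$ the static condition fails ($c\lambda^2 = 0.64 < 1$), but the same network sequence becomes detectable at $\eta = 0.5$, where the right-hand side of the inequality drops to $0.6$.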

This analysis corresponds to robust reconstruction on trees, where we are given noisy information at the leaves of a 
tree and we want to propagate this information to the root [31]. For $k = 2$ groups, it is known rigorously in the static 
case [26] that detecting the communities below this bound is information-theoretically impossible. We conjecture that 
the same is true in the dynamic case. For $k > 4$ groups, it has been conjectured [25] that it is information-theoretically 
possible to succeed beyond the Kesten-Stigum bound, but that doing so takes exponential time. 


III. BAYESIAN INFERENCE OF THE MODEL 


Given an observed graph sequence $G(t)$, we use Bayesian inference to learn the posterior distribution of latent 
community assignments: 

$$P(\{g(t)\} \mid \{E(t)\}, p, \eta) = \frac{P(\{E(t)\}, \{g(t)\} \mid p, \eta)}{\sum_{\{g'(t)\}} P(\{E(t)\}, \{g'(t)\} \mid p, \eta)} . \qquad (8)$$

FIG. 1. A schematic representation of belief propagation messages (see Eqs. (9) and (10)) being passed along spatial and 
temporal edges in a spatiotemporal graph. 


This distribution is hard to compute in general because the summation runs over an exponential number of terms. 
However, when the spatiotemporal graph is sparse, as generated by our model, we may make a controlled Bethe 
approximation (also known as belief propagation (BP) in machine learning and as the “cavity method” in statistical 
physics) that allows us to carry out Bayesian inference in an efficient and asymptotically optimal way. We now describe 
a BP algorithm for learning our model from data, which we then linearize to obtain a fast spectral approach, based 
on a dynamic version of the non-backtracking matrix. This yields two inference algorithms that perform accurately 
all the way down to the transition. 


A. Belief propagation 


Instead of inferring the joint posterior distribution, we use belief propagation to compute posterior marginal probabilities 
of node labels $\{\mu_r^{i}(t)\}$ over time. Belief propagation assumes conditional independence of these marginals, 
which is exact when the graph is a tree and is a good approximation when the graph is locally treelike, as in our 
spatiotemporal graph. In our setting, nodes update their current belief about marginals according to the marginals of 
both their spatial and temporal neighbors. That is, we define two types of messages: spatial messages that pass along 
spatial edges and temporal messages that pass along temporal edges. Fig. 1 illustrates this message passing scheme 
for a spatiotemporal graph. 

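A spatial message $\mu_r^{i \to j}(t)$ gives the marginal probability of a node $i$ at time $t$ being in community $r$, when we 
consider node $j$ to be absent at time $t$. This message is computed as 

$$\mu_r^{i \to j}(t) = \frac{q_r}{Z^{i \to j}(t)} \prod_{t' \in \{t-1,\, t+1\}} \left[ \frac{\eta}{q_r}\, \mu_r^{i(t') \to i(t)} + (1 - \eta) \right] \times \prod_{\substack{\ell : (i,\ell) \in E(t) \\ \ell \ne j}} \sum_s p_{rs}\, \mu_s^{\ell \to i}(t) \prod_{\substack{\ell : (i,\ell) \notin E(t) \\ \ell \ne j}} \sum_s (1 - p_{rs})\, \mu_s^{\ell \to i}(t) , \qquad (9)$$

where $Z^{i \to j}(t)$ is the normalization. The temporal message $\mu_r^{i(t) \to i(t+1)}$ (or $\mu_r^{i(t) \to i(t-1)}$) represents the marginal 
probability of node $i$ at time $t$ being in community $r$, when we consider node $i$ to be absent at time $t + 1$ (or at $t - 1$) 
and has a similar form: 

$$\mu_r^{i(t) \to i(t \pm 1)} = \frac{q_r}{Z^{i(t) \to i(t \pm 1)}} \left[ \frac{\eta}{q_r}\, \mu_r^{i(t \mp 1) \to i(t)} + (1 - \eta) \right] \times \prod_{\ell : (i,\ell) \in E(t)} \sum_s p_{rs}\, \mu_s^{\ell \to i}(t) \prod_{\ell : (i,\ell) \notin E(t)} \sum_s (1 - p_{rs})\, \mu_s^{\ell \to i}(t) . \qquad (10)$$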


When $t = 0$ or $t = T$, we remove the term corresponding to the temporal edge coming from outside the domain 
of $t$. Furthermore, following past work on BP for the static SBM [25, 32], we exploit these networks’ sparsity to 
reduce the computational complexity of the spatial updates at the cost of introducing $O(1/n)$ corrections in sparse 
graphs. Specifically, we let $\mu_r^{\ell \to i}(t) = \mu_r^{\ell}(t)$ be the same for all of $\ell$’s non-neighbors $i$. We then model the effects of all such 
non-edges as an adaptive external field on each node, which depends on the current estimated marginals. That 
is, we let $\prod_{\ell : (i,\ell) \notin E(t)} \sum_s (1 - p_{rs})\, \mu_s^{\ell}(t) \approx e^{-h_r(t)}$, where $h_r(t) = \frac{1}{n} \sum_{\ell} \sum_s c_{rs}\, \mu_s^{\ell}(t)$, which has the effect of preventing belief 
propagation from putting all the nodes at a given time into the same community. The adaptive fields only need to 
be updated after each BP iteration. This approximation yields a significant improvement in efficiency, reducing the 
computational complexity to be proportional to the total number of edges in the spatiotemporal graph, $cnT$, rather than 
$n^2 T$. 
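
A minimal sketch of the adaptive external field for a single time step is given below, assuming the form $h_r(t) = \frac{1}{n} \sum_\ell \sum_s c_{rs} \mu_s^{\ell}(t)$. The function name `external_field` and the array layout (one row of marginals per node) are assumptions made for illustration.

```python
import numpy as np

def external_field(marginals, c_matrix):
    """Compute h_r = (1/n) * sum_l sum_s c_rs * mu_s^l for one time step.

    marginals: array of shape (n, k); row l holds node l's current marginal.
    c_matrix:  array of shape (k, k) with entries c_rs = n * p_rs.
    """
    n = marginals.shape[0]
    # Summing marginals over nodes first keeps the cost at O(n k + k^2).
    return c_matrix @ marginals.sum(axis=0) / n
```

With uniform marginals every group receives the same field $c$, so the field has no effect on the normalized messages. If instead all nodes lean toward one group, that group's field rises to $c_{\mathrm{in}}$ while the others drop to $c_{\mathrm{out}}$, and the factor $e^{-h_r(t)}$ penalizes the crowded group in the assortative case.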

Once the BP messages converge, we compute the marginal probability $\mu_r^{i}(t)$ that node $i$ belongs to community $r$ at time $t$. This 
is identical to Eqs. (9) and (10), except that we take all incoming edges into account. We then obtain a partition by 
marginalization, which assigns each node to its most-likely group: 

$$\hat{g}_i(t) = \operatorname{argmax}_r\, \mu_r^{i}(t) . \qquad (11)$$


It is well known in Bayesian inference [33] that if the marginals are exact, then the marginalized partition is the 
optimal estimator of the latent community labels. Because spatiotemporal graphs under our model are sparse, we 
know that with $n \to \infty$, the marginals given by BP are asymptotically correct. Thus, our BP algorithm succeeds all 
the way down to the detectability threshold given by Eq. (7), and gives an asymptotically optimal partition in terms 
of accuracy. 
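
The marginalization step is simple to express in code. The sketch below assumes the converged marginals are stored in an array indexed as `marginals[t, i, r]`; that layout and the function name `marginalize` are choices made for this example only.

```python
import numpy as np

def marginalize(marginals):
    """Map marginals[t, i, r] to hard labels[t, i] by taking the argmax over r.

    Ties are broken toward the lowest group index, as np.argmax returns
    the first maximal entry.
    """
    return np.argmax(marginals, axis=-1)
```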


B. Spectral clustering 


The BP equations described above can be linearized to obtain a fast spectral approach for detecting community 
structure in dynamic networks. It is easy to verify that in our setting, when $q_r = 1/k$, the average degree in each 
group is $c$. This implies that BP equations will always have a solution 

$$\mu_r^{i \to j}(t) = \mu_r^{i(t) \to i(t \pm 1)} = \frac{1}{k} , \qquad (12)$$


which we call a factorized fixed point. This fixed point only reflects the permutation symmetry in the system, and 
could be unstable due to random perturbations. If we use the correct parameters in the BP equations, i.e., the same 
parameters used to generate the observed network, then in the language of physics we would say that the system is on the 
Nishimori line [33]. That is, if the BP messages deviate from the factorized solution, then they are correlated with 
the latent community labels and we say that there is no spin glass phase in the system [33]. This allows us to simplify the 
BP equations by studying how the messages deviate from the factorized solution, which results in a linearized version 
of BP. In the static SBM, this linearization is equivalent to a spectral clustering algorithm using the non-backtracking 
matrix [33]. 

To do this, we rewrite the BP messages as the uniform fixed point $\frac{1}{k}$ plus deviations away from it. The 
vector of deviations is given by 

$$\boldsymbol{\delta} = (\mu_1, \mu_2, \dots, \mu_k) - \left( \frac{1}{k}, \frac{1}{k}, \dots, \frac{1}{k} \right) ,$$


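and the linearized BP equations are then 

$$\boldsymbol{\delta}^{i(t) \to j(t)} = \sum_{\ell(t) \in \partial i(t) \setminus j(t)} U\, \boldsymbol{\delta}^{\ell(t) \to i(t)} + \sum_{t' \in \{t-1,\, t+1\}} V\, \boldsymbol{\delta}^{i(t') \to i(t)} , \qquad (13)$$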
FIG. 2. (left) Spectrum (in the complex plane) of matrix $B'$ for a network generated by our model with $n = 300$, $c = 3$, $k = 2$ 
groups and $(\epsilon, \eta) = (0.05, 0.5)$. The complex eigenvalues are circumscribed by the circle. (right) Overlap as a function of $\epsilon$ for 
different values of $\eta$ (given in the legend). The detectability thresholds for each choice of $\eta$, according to Eq. (7), are shown as 
vertical lines in the lower panel, and the hatched area shows the region of detectability for static networks [25]. Each data point 
is the average of 100 instances of dynamic networks from our model, with $n = 512$, $T = 40$, and $k = 2$ groups, with average 
degree $c = 16$. 


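where $\partial i(t)$ means neighbors of $i(t)$, and $U$ and $V$ denote derivatives evaluated at the factorized fixed point: 

$$U_{sr} = \left. \frac{\partial \mu_s^{i(t) \to j(t)}}{\partial \mu_r^{\ell(t) \to i(t)}} \right|_{\frac{1}{k}} , \qquad V_{sr} = \left. \frac{\partial \mu_s^{i(t) \to j(t)}}{\partial \mu_r^{i(t \pm 1) \to i(t)}} \right|_{\frac{1}{k}} . \qquad (14)$$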


Solving Eq. (13) amounts to finding eigenvectors of the Jacobian matrix $B$ composed of derivatives of the BP messages. 
However, the size of the matrix $B$ is $(cnT + 2n(T-1))$, which is relatively large for an eigenvector problem. 
Using the non-backtracking matrix approach [29], we convert this problem into a smaller eigenvector problem of size 
$4nT \times 4nT$ by defining 


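$$B' = \begin{pmatrix} \lambda A^{\mathrm{spatial}} & -\lambda I & \eta A^{\mathrm{spatial}} & 0 \\ \lambda (D^{\mathrm{spatial}} - I) & 0 & \eta D^{\mathrm{spatial}} & 0 \\ \lambda A^{\mathrm{temp}} & 0 & \eta A^{\mathrm{temp}} & -\eta I \\ \lambda D^{\mathrm{temp}} & 0 & \eta (D^{\mathrm{temp}} - I) & 0 \end{pmatrix} ,$$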


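where $I$ denotes the $nT$-dimensional identity matrix; $A^{\mathrm{temp}}$ is the adjacency matrix of temporal edges with $A^{\mathrm{temp}}_{u(t), v(t')} = \delta_{uv} (\delta_{t,t'+1} + \delta_{t,t'-1})$; 
$D^{\mathrm{temp}}$ is the diagonal matrix of temporal degrees with $D^{\mathrm{temp}}_{u(t), u(t)} = 2$ if $0 < t < T$, and 1 if $t = 0$ 
or $t = T$; $A^{\mathrm{spatial}}$ is the $nT$-dimensional matrix consisting of all the spatial edges, i.e., the block-diagonal matrix of the $A^{(t)}$, meaning 
$A^{\mathrm{spatial}}_{u(t), v(t')} = \delta_{tt'} A^{(t)}_{uv}$; and $D^{\mathrm{spatial}}$, the corresponding block-diagonal matrix of the $D^{(t)}$, is the diagonal matrix of spatial degrees where $D^{\mathrm{spatial}}_{u(t), u(t)} = D^{(t)}_{uu}$. 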

We now obtain a spectral clustering algorithm using $B'$ in the following way: given a spatiotemporal graph, we 
construct the matrix $B'$, then take the vectors composed of the first $n$ entries of the eigenvectors associated with the largest (absolute) 
eigenvalues, and finally perform $k$-means clustering on the matrix composed of these vectors. This yields a partition of the 
nodes; if the desired number of clusters is two, then we simply use the sign of the entries of the vector to separate nodes into 
two communities. 
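
The construction of $B'$ can be sketched with dense NumPy blocks as follows. This is an illustrative assembly under stated assumptions, not the authors' implementation: the function name `build_b_prime` is hypothetical, the input is taken to be a list of symmetric 0/1 adjacency matrices (one per snapshot), and the lower-left block $\lambda D^{\mathrm{temp}}$ is inferred from summing the linearized message equations over outgoing temporal edges. A practical implementation would use sparse matrices.

```python
import numpy as np

def build_b_prime(A_list, lam, eta):
    """Assemble the 4N x 4N matrix B', where N = n * (number of snapshots)."""
    S = len(A_list)                       # number of snapshots
    n = A_list[0].shape[0]
    N = n * S
    idx = np.arange(n)

    A_sp = np.zeros((N, N))               # block-diagonal spatial adjacency
    for t, A in enumerate(A_list):
        A_sp[t * n:(t + 1) * n, t * n:(t + 1) * n] = A

    A_tmp = np.zeros((N, N))              # each node linked to its copy at t + 1
    for t in range(S - 1):
        A_tmp[t * n + idx, (t + 1) * n + idx] = 1
        A_tmp[(t + 1) * n + idx, t * n + idx] = 1

    D_sp = np.diag(A_sp.sum(axis=1))      # spatial degrees
    D_tmp = np.diag(A_tmp.sum(axis=1))    # temporal degrees (1 at the ends, else 2)
    I = np.eye(N)
    Z = np.zeros((N, N))
    return np.block([
        [lam * A_sp,       -lam * I, eta * A_sp,        Z],
        [lam * (D_sp - I),  Z,       eta * D_sp,        Z],
        [lam * A_tmp,       Z,       eta * A_tmp,      -eta * I],
        [lam * D_tmp,       Z,       eta * (D_tmp - I), Z],
    ])
```

The eigenvalues can then be obtained with `np.linalg.eig(build_b_prime(A_list, lam, eta))`, keeping the real eigenvalues outside the bulk and the leading entries of their eigenvectors for the clustering step described above.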

From the principle of linearization, we know that real eigenvalues of the non-backtracking matrix B' describe 
stability properties of fixed points of the BP equations, i.e., if there is a real-valued eigenvalue larger than unity, it 
represents a stable fixed point in the equations. Moreover, if the BP equations have a stable fixed point, then B' 
should have a real eigenvalue that is larger than unity, denoting a partition of the nodes that correlates with the 
latent community labels. Thus, our spectral clustering algorithm should work as long as BP works, implying that it 
also works all the way down to the detectability transition in sparse networks. 

In Fig. 2 (left) we show the spectrum of $B'$ in the complex plane for a network in the detectable regime, generated 
by the model. As with existing non-backtracking approaches [21], most of the eigenvalues are confined to a disk, while 
several real eigenvalues fall outside this disk. In this example, entries of the eigenvector associated with the largest 
real eigenvalue have the same sign, hence the leading or “ferromagnetic” eigenvector does not yield information about 
the latent community structure. In practice, we can perform regularizations to push such ferromagnetic eigenvectors 
back into the bulk, thereby lifting the eigenvectors correlated with the latent community structure to the top positions. 
Eigenvectors associated with other real eigenvalues outside the bulk are correlated with the latent community structure. 
In this case, because we have two groups, we obtain the inferred partition by using the sign of the entries of the second real 
eigenvector $v_2$. 
FIG. 3. Heat maps showing the numerically estimated overlap for (left) belief propagation and (right) spectral algorithms. 
The detectability threshold from Eq. (7) is shown as a solid line. Each point shows the average over 100 instances of dynamic 
networks drawn from our model with $n = 512$, $T = 40$, $k = 2$ groups, and average degree $c = 16$. 


IV. NUMERICAL VERIFICATION 


To verify our claims of the detectability transition in dynamic networks, and the accuracy of our algorithms, 
we conduct the following numerical experiment. Using our generative model of dynamic networks with community 
structure, we generate a number of dynamic networks for various choices of $(\epsilon, \eta)$. When $\epsilon = c_{\mathrm{out}} / c_{\mathrm{in}} = 0$, communities 
are maximally strong, with every edge being located within a community, while at $\epsilon = 1$, we have Erdős-Rényi random 
graphs with no community structure. We then use our BP or spectral algorithm to infer the group assignments, 
assuming within each sequence that parameters $\{\eta, \epsilon, c\}$ are known. For each choice of $(\epsilon, \eta)$, we average our results 
over 100 dynamic networks with $T = 40$ graphs and $n = 512$ nodes (for 20,480 nodes total), with an average degree 
$c = 16$, divided into $k = 2$ latent communities. 
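
To relate the experimental parameters to the threshold, the sketch below converts $(c, k, \epsilon)$ into $(c_{\mathrm{in}}, c_{\mathrm{out}})$ and solves $c\lambda^2 = (1 - \eta^2)/(1 + \eta^2)$ for the critical $\epsilon$, using $\lambda = (1 - \epsilon)/(1 + (k - 1)\epsilon)$, which follows from the definitions of $\lambda$, $c$, and $\epsilon$. The function names are hypothetical, and the second function assumes the critical $\lambda$ does not exceed 1.

```python
import math

def degrees_from_epsilon(c, k, eps):
    """Return (c_in, c_out) with average degree c and ratio eps = c_out / c_in."""
    c_in = k * c / (1 + (k - 1) * eps)
    return c_in, eps * c_in

def critical_epsilon(c, k, eta):
    """Value of eps at which c * lambda^2 equals (1 - eta^2) / (1 + eta^2)."""
    lam_star = math.sqrt((1 - eta**2) / (c * (1 + eta**2)))
    # Invert lambda = (1 - eps) / (1 + (k - 1) * eps) for eps.
    return (1 - lam_star) / (1 + (k - 1) * lam_star)
```

For the setting used here ($c = 16$, $k = 2$), the static case $\eta = 0$ gives a critical value of $\epsilon = 0.6$, corresponding to $c_{\mathrm{in}} = 20$ and $c_{\mathrm{out}} = 12$, and the critical $\epsilon$ moves toward 1 as $\eta$ increases.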

We measure the accuracy of the inferred community labels by the overlap between the latent partition $g^*$ and the 
inferred one $g$. This is the fraction of nodes labeled correctly, maximized over all $k!$ permutations of the groups, 
normalized so that it is 1 if $g = g^*$ and 0 if $g$ is uniformly random. In Fig. 2 (right) we show the overlap obtained by 
BP for dynamic networks as a function of $\epsilon$ for several choices of $\eta$. The detectability threshold for each $\eta$, from Eq. (7), is 
shown as a vertical line in the lower panel. When $\eta = 0$, we recover the static detectability threshold given by Eq. (3). 
As we increase $\eta$, the phase transition occurs at increasing values of $\epsilon$, as predicted, with the largest increase occurring 
when $\eta = 1$. 
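
The overlap described above can be computed directly for small $k$. In the sketch below, the function name `overlap` and the normalization $(f - 1/k)/(1 - 1/k)$, where $f$ is the best fraction of correctly labeled nodes, are one natural reading of the definition in the text; enumerating all $k!$ permutations is only feasible for a handful of groups.

```python
from itertools import permutations

import numpy as np

def overlap(g_true, g_inferred, k):
    """Normalized overlap between two labelings with labels in 0..k-1."""
    g_true = np.asarray(g_true)
    g_inferred = np.asarray(g_inferred)
    # Best agreement over all relabelings of the inferred groups.
    best = max(
        np.mean(np.array(perm)[g_inferred] == g_true)
        for perm in permutations(range(k))
    )
    # Rescale so that chance-level agreement 1/k maps to 0 and a perfect match to 1.
    return (best - 1.0 / k) / (1.0 - 1.0 / k)
```

For a dynamic network, the labels from all time steps can be flattened into one vector before calling the function, or the overlap can be computed per time step and averaged; the text does not specify which convention was used.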

Similar results are obtained for other choices of n and T, with better agreement for larger networks. The slight 
deviation between numerical and analytic transition points observable in Fig. 2 (right) is a finite-size effect, which we 
numerically estimated to decrease like 0(^/nT). 

Figure 3 shows the overlap throughout the $(\epsilon, \eta)$-plane, using both BP and spectral algorithms, along with the line 
of the threshold given by Eq. (7). Notably, both algorithms perform similarly: they have large overlap with small $\epsilon$, 
indicating that the learned partition is highly correlated with the latent community structure. As $\epsilon$ increases (weaker 
community structure), both algorithms encounter a second-order phase transition in which the overlap decreases from 
a finite value to zero. Separate numerical experiments indicate that the convergence time of BP diverges in the 
vicinity of the phase transition, which agrees with past work on the detectability threshold in static networks [25]. 
We also find that at each point in the $(\epsilon, \eta)$-plane, the accuracy of BP is always larger than that of the spectral algorithm, 
especially away from the transition, reflecting the optimality of our BP algorithm. 











V. CONCLUSIONS 


We have derived a mathematically precise and general limit to the detectability of communities in dynamic networks. 
This threshold assumes a probabilistic model of community structure that is a special case of several previously 
developed methods to detect dynamic communities: specifically, where nodes may change their community membership 
over time, but where edges are generated independently at each time step. We also gave two efficient algorithms 
for learning latent community structure that are optimal in the sense that they succeed all the way down to the 
detectability threshold in dynamic networks. 

A simple extension of our algorithm is to apply our BP equations to a dense network consisting of all spatial edges 
from all graphs projected to the time t, handling the message passing over time steps by using a damping factor 
1. This approach extends our analysis to networks that evolve in continuous time rather than in discrete time steps. 

For larger numbers of groups, such as $k > 4$, it has been conjectured [35] that there is a “hard but detectable” 
regime where the factorized fixed point described in Section III B is locally stable, but where one or more accurate 
fixed points exist as well. In such a regime, community detection is information-theoretically possible, but we believe 
that it takes exponential time (though see [31] for the case where the number of groups grows with n). We propose 
this as a direction for further work. 

Other directions for future work include handling cases where the community interaction matrix p may also change 
over time (a situation similar to change-point detection in networks [35]), where edges are not generated independently 
at each time step, or where networks have edge weights [32] or node annotations. 


ACKNOWLEDGMENTS 


The authors thank Elchanan Mossel and Andrey Lokhov for helpful conversations. Financial support for this 
research was provided in part by Grant No. IIS-1452718 (AG, AC) from the National Science Foundation, Grant 
#FA9550-12-1-0432 from the U.S. Air Force Office of Scientific Research (AFOSR) and the Defense Advanced Research 
Projects Agency (DARPA) (LP), and the John Templeton Foundation (PZ, CM). Author order is joint first-authorship 
for AG and PZ, with the remaining authors appearing alphabetically. 


[1] A. Clauset and N. Eagle, in DIMACS Workshop on Computational Methods for Dynamic Interaction Networks (2007) 
arXiv:1211.7343. 

[2] T. Berger-Wolf, C. Tantipathananandh, and D. Kempe, in Link Mining: Models, Algorithms, and Applications (Springer, 
2010) pp. 307-336. 

[3] L. Gauvin, A. Panisson, and C. Cattuto, PLOS ONE 9, e86028 (2014). 

[4] M. Kim and J. Leskovec, in Advances in Neural Information Processing Systems (2013) pp. 1385-1393. 

[5] P. J. Mucha, T. Richardson, K. Macon, M. A. Porter, and J. Onnela, Science 328, 876 (2010). 

[6] R. Rossi, B. Gallagher, J. Neville, and K. Henderson, in Proceedings of the 6th ACM International Conference on Web 
Search and Data Mining (WSDM) (2013). 

[7] E. P. Xing, W. Fu, and L. Song, Annals of Applied Statistics 4, 535 (2010). 

[8] L. Zhu, G. Steeg, and A. Galstyan, arXiv preprint, arXiv:1411.3675 (2014). 

[9] U. Von Luxburg, R. Williamson, and I. Guyon, in ICML Unsupervised and Transfer Learning (2012) pp. 65-80. 

[10] D. Bassett, M. Porter, N. Wymbs, S. Grafton, J. Carlson, and P. Mucha, Chaos 23, 013142 (2013). 

[11] M. Bazzi, M. Porter, S. Williams, M. McDonald, D. Fenn, and S. Howison, arXiv preprint, arXiv:1501.00040 (2015). 

[12] E. Acar, D. Dunlavy, and T. Kolda, in Data Mining Workshops, 2009. ICDMW’09. IEEE International Conference on 
(IEEE, 2009) pp. 262-269. 

[13] D. Dunlavy, T. Kolda, and E. Acar, ACM Transactions on Knowledge Discovery from Data (TKDD) 5, 10 (2011). 

[14] J. Sun, C. Faloutsos, S. Papadimitriou, and P. Yu, in Proceedings of the 13th ACM SIGKDD (ACM, 2007) pp. 687-696. 

[15] M. Rosvall and C. Bergstrom, PLOS ONE 5, e8694 (2010). 

[16] T. Yang, Y. Chi, S. Zhu, Y. Gong, and R. Jin, in SDM, Vol. 2009 (SIAM, 2009) pp. 990-1001. 

[17] K. Xu and A. Hero, Selected Topics in Signal Processing, IEEE Journal of 8, 552 (2014). 

[18] Q. Han, K. Xu, and E. Airoldi, arXiv preprint, arXiv:1410.8597 (2014). 

[19] T. Peixoto, arXiv preprint, arXiv:1504.02381 (2015). 

[20] T. Valles-Catala, F. Massucci, R. Guimera, and M. Sales-Pardo, arXiv preprint, arXiv:1411.1098 (2014). 

[21] C. Aggarwal and K. Subbian, ACM Computing Surveys (CSUR) 47, 10 (2014). 

[22] T. Hartmann, A. Kappes, and D. Wagner, arXiv preprint, arXiv:1401.3516 (2014). 

[23] P. Holland, K. Laskey, and S. Leinhardt, Social Networks 5, 109 (1983). 








[24] K. Nowicki and T. A. B. Snijders, Journal of the American Statistical Association 96 (2001). 

[25] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborova, Physical Review E 84, 066106 (2011). 

[26] E. Mossel, J. Neeman, and A. Sly, Probability Theory and Related Fields , 1 (2012). 

[27] L. Massoulie, in Proc. of the 46th Annual ACM Symposium on Theory of Computing (STOC) (ACM, 2014) pp. 694-703. 

[28] E. Mossel, J. Neeman, and A. Sly, in Proceedings of The 27th Conference on Learning Theory, COLT 2014, Barcelona, 
Spain, June 13-15, 2014 (2014) pp. 356-370. 

[29] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborova, and P. Zhang, Proc. Natl. Acad. Sci. USA 110, 20935 
(2013). 

[30] C. Bordenave, M. Lelarge, and L. Massoulie, arXiv preprint arXiv:1501.06087 (2015). 

[31] S. Janson and E. Mossel, Annals of Probability , 2630 (2004). 

[32] C. Aicher, A. Z. Jacobs, and A. Clauset, Journal of Complex Networks 3, 221 (2015). 

[33] Y. Iba, Journal of Physics A: Mathematical and General 32, 3875 (1999). 

[34] V. Kanade, E. Mossel, and T. Schramm, in Approximation, Randomization, and Combinatorial Optimization. Algorithms 
and Techniques, APPROX/RANDOM 2014, September 4-6, 2014, Barcelona, Spain (2014) pp. 779-792. 

[35] L. Peel and A. Clauset, in 29th AAAI Conference on Artificial Intelligence (2015). 



